A privacy-preserving solution for compressed storage and selective retrieval of genomic data.

نویسندگان

  • Zhicong Huang
  • Erman Ayday
  • Huang Lin
  • Raeka S Aiyar
  • Adam Molyneaux
  • Zhenyu Xu
  • Jacques Fellay
  • Lars M Steinmetz
  • Jean-Pierre Hubaux
چکیده

In clinical genomics, the continuous evolution of bioinformatic algorithms and sequencing platforms makes it beneficial to store patients' complete aligned genomic data in addition to variant calls relative to a reference sequence. Due to the large size of human genome sequence data files (varying from 30 GB to 200 GB depending on coverage), two major challenges facing genomics laboratories are the costs of storage and the efficiency of the initial data processing. In addition, privacy of genomic data is becoming an increasingly serious concern, yet no standard data storage solutions exist that enable compression, encryption, and selective retrieval. Here we present a privacy-preserving solution named SECRAM (Selective retrieval on Encrypted and Compressed Reference-oriented Alignment Map) for the secure storage of compressed aligned genomic data. Our solution enables selective retrieval of encrypted data and improves the efficiency of downstream analysis (e.g., variant calling). Compared with BAM, the de facto standard for storing aligned genomic data, SECRAM uses 18% less storage. Compared with CRAM, one of the most compressed nonencrypted formats (using 34% less storage than BAM), SECRAM maintains efficient compression and downstream data processing, while allowing for unprecedented levels of security in genomic data storage. Compared with previous work, the distinguishing features of SECRAM are that (1) it is position-based instead of read-based, and (2) it allows random querying of a subregion from a BAM-like file in an encrypted form. Our method thus offers a space-saving, privacy-preserving, and effective solution for the storage of clinical genomic data.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Attribute-based Access Control for Cloud-based Electronic Health Record (EHR) Systems

Electronic health record (EHR) system facilitates integrating patients' medical information and improves service productivity. However, user access to patient data in a privacy-preserving manner is still challenging problem. Many studies concerned with security and privacy in EHR systems. Rezaeibagha and Mu [1] have proposed a hybrid architecture for privacy-preserving accessing patient records...

متن کامل

Privacy-Preserving Content-Based Image Retrieval in the Cloud (Extended Version)

Storage requirements for visual data have been increasing in recent years, following the emergence of many new highly interactive multimedia services and applications for both personal and corporate use. This has been a key driving factor for the adoption of cloud-based data outsourcing solutions. However, outsourcing data storage to the Cloud also leads to new challenges that must be carefully...

متن کامل

Differentially Private Local Electricity Markets

Privacy-preserving electricity markets have a key role in steering customers towards participation in local electricity markets by guarantying to protect their sensitive information. Moreover, these markets make it possible to statically release and share the market outputs for social good. This paper aims to design a market for local energy communities by implementing Differential Privacy (DP)...

متن کامل

Genomic Data Privacy Protection using Compressed Sensing

In this article, we present a privacy preserving genomic data dissemination algorithm based on compressed sensing. We participated in the challenge at the iDASH on March 24, 2014 in La Jolla, California and the result of the challenge are available online . In our proposed method, we are adding noise to the sparse representation of the input vector to make it differentially private. First, we f...

متن کامل

Study of privacy-preserving framework for cloud storage

In order to implement a privacy-preserving, efficient and secure data storage and access environment of cloud storage, the following problems must be considered: data index structure, generation and management of keys, data retrieval, treatments of change of users’ access right and dynamic operations on data, and interactions among participants. To solve those problems, the interactive protocol...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Genome research

دوره 26 12  شماره 

صفحات  -

تاریخ انتشار 2016